Introduction

Our project examines the shift away from standardized tests as a requirement for university admission. Specifically, we narrowed our scope to the relationship between average SAT scores and household income across zip codes in California. One objective is to uncover discrepancies from area to area and how they affect students’ scores. Because college entrance exams such as the SAT or ACT were a determining factor in admissions prior to the coronavirus pandemic, we want to examine whether these tests are a fair measure of student academics. Our findings aim to show whether standardized tests are a true reflection of a student’s capabilities or more a reflection of the resources and means that students had access to. The data come from the California Department of Education and the 2010-2014 American Community Survey (ACS).

Summary Information

This summary highlights how household income correlates with a student’s total SAT score. Based on our data, the mean and median household incomes for California are 62717 and 56646, respectively. The average SAT score for students in below-average-income households was 1364.5, compared with 1539.3 for students in above-average-income households. The zip code with the highest median household income was 95070.
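As a minimal sketch of how summary figures like these could be computed, the snippet below uses only the Python standard library. The rows are made-up placeholder values, not the report's data, and the function name `summarize` is hypothetical. It also assumes that "below average" means below the mean income, which the report does not state explicitly.

```python
from statistics import mean, median

# Hypothetical rows: (zip code, median household income, average SAT score).
# These values are invented for illustration and are NOT from the report's data.
rows = [
    ("00001", 40000, 1300),
    ("00002", 50000, 1400),
    ("00003", 60000, 1450),
    ("00004", 90000, 1600),
]


def summarize(rows):
    """Return income statistics and average SAT scores split at the mean income."""
    incomes = [income for _, income, _ in rows]
    mean_income = mean(incomes)
    # Assumption: "below average" means strictly below the mean income.
    below = [sat for _, income, sat in rows if income < mean_income]
    above = [sat for _, income, sat in rows if income >= mean_income]
    return {
        "mean_income": mean_income,
        "median_income": median(incomes),
        "avg_sat_below": mean(below),
        "avg_sat_above": mean(above),
        "highest_income_zip": max(rows, key=lambda r: r[1])[0],
    }


summary = summarize(rows)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Splitting at the median instead of the mean would only require changing the threshold used in the two list comprehensions.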

Summary Table

The summary table shows how total SAT score and mean household income differ across areas (zip codes) in California.


From the table, we can see that zip codes at the beginning of the list have household incomes far below average and low SAT scores, whereas zip codes toward the end of the list have household incomes far above average and higher SAT scores.

First Visualization

The purpose of this visualization is to highlight the potential relationship between geographic location and average SAT performance. A map gives better insight into how SAT performance varies by location. Here is the visualization:


From the map, there appears to be a higher percentage of students passing the performance benchmark near San Francisco and Sacramento than elsewhere. Students near Fresno seem to have lower performance than others.

Second Visualization

The purpose of this visualization is to compare high school (HS) average SAT score and median income by zip code. The key maps bar colors to zip codes.


From the bar chart, we observe an upward trend: SAT score rises as median income increases. Around the 1500 range in SAT score, not only do we see a concentration of data, but the color key shows that many of the zip codes are 955xx, which is Humboldt County. Toward the upper boundary of SAT scores, we see more 945xx zip codes, which is Alameda County in the Bay Area. This is consistent with what we see in our map visualization.
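One way to check a regional pattern like this numerically is to group zip codes by their leading digits and average the SAT scores within each group. The sketch below is illustrative only: the zip codes and scores are invented placeholders, and `mean_sat_by_prefix` is a hypothetical helper, not part of our analysis code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (zip code, average SAT score) pairs.
# Both the zip codes and the scores are placeholders, NOT the report's data.
rows = [
    ("11101", 1500),
    ("11102", 1520),
    ("22201", 1700),
]


def mean_sat_by_prefix(rows, digits=3):
    """Average SAT score for each group of zip codes sharing a leading prefix."""
    groups = defaultdict(list)
    for zip_code, sat in rows:
        # Zip codes are kept as strings so leading zeros are preserved.
        groups[zip_code[:digits]].append(sat)
    return {prefix: mean(scores) for prefix, scores in groups.items()}


by_prefix = mean_sat_by_prefix(rows)
for prefix, avg in sorted(by_prefix.items()):
    print(f"{prefix}xx: {avg}")
```

With real data, the same grouping would let prefixes such as 955 and 945 be compared directly instead of being read off the color key.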

Third Visualization

The purpose of this visualization is to plot HS average SAT score against median income by zip code to reveal any linear or nonlinear relationship. A trend line highlights how income and SAT scores are related.


From this scatterplot, we observe a roughly linear relationship between SAT scores and income. Following the trendline, SAT scores trend upward as median zip code income increases. The correlation between the two is 0.61. Relative to the trendline, several high schools outperformed the SAT score expected for their zip code's median income, while some high schools in richer zip codes underperformed. Toward the top right of the graph, the schools that sit well above the trendline show that some schools, despite having access to more resources, performed about as well as schools with fewer resources.
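The correlation and trendline reasoning above can be sketched with a Pearson correlation coefficient and an ordinary least-squares line. This is a minimal illustration under stated assumptions: the data are made-up placeholders (so the output will not reproduce the 0.61 reported here), the function names are hypothetical, and the report does not say how its trendline was actually fitted.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return (mean_x, mean_y, Sxy, Sxx, Syy) for paired samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return mx, my, sxy, sxx, syy


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    _, _, sxy, sxx, syy = _centered_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my, sxy, sxx, _ = _centered_sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(xs, ys):
    """Actual minus predicted y; positive means above the trendline."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical data: median income (x) and average SAT score (y) per zip code.
# These values are invented for illustration and are NOT from the report's data.
incomes = [40000, 50000, 60000, 90000]
sat_scores = [1300, 1400, 1450, 1600]

print("r =", round(pearson_r(incomes, sat_scores), 3))
print("residuals =", [round(e, 1) for e in residuals(incomes, sat_scores)])
```

A positive residual corresponds to a school scoring above what the line predicts for its zip code's income, and a negative residual to a school scoring below it.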